From a Stream of Relational Queries to Distributed Stream Processing
Abstract
Applications in several domains are now written to process live data originating from hardware- and software-based streaming sources. Many of these applications rely solely on database and data warehouse technologies, even though they need neither transactional support nor ACID properties. In several extreme high-load cases, this approach does not scale to the processing speeds that these applications demand. In this paper we demonstrate an application acceleration approach whereby a regular ODBC-based application is converted into a true streaming application with minimal disruption from a software engineering standpoint. We showcase our approach on three real-world applications. We experimentally demonstrate the substantial performance improvements of the accelerated implementation over the original database-oriented implementation.
Similar resources
Ivanova Scalable Scientific Stream Query Processing
Ivanova, M. 2005. Scalable Scientific Stream Query Processing. Acta Universitatis Upsaliensis. Uppsala Dissertations from the Faculty of Science and Technology 66. 137 pp. Uppsala. ISBN 91-554-6351-7 Scientific applications require processing of high-volume on-line streams of numerical data from instruments and simulations. In order to extract information and detect interesting patterns in thes...
Verteilung globaler Anfragen auf heterogene Stromverarbeitungssysteme
Deployment of Global Queries in Distributed and Heterogeneous Stream-Processing Systems. Distributed in-network stream processing is more efficient than sending all data to a central processing unit. In the past few years Stream-Processing Systems (SPSs) have established themselves as an interesting alternative to database systems for continuous query processing. There are many scenarios having w...
Customizable Parallel Execution of Scientific Stream Queries
Scientific applications require processing high-volume on-line streams of numerical data from instruments and simulations. We present an extensible stream database system that allows scalable and flexible continuous queries on such streams. Application-dependent streams and query functions are defined through an object-relational model. Distributed execution plans for continuous queries are desc...
A Quality-Centric Data Model for Distributed Stream Management Systems
It is challenging for large-scale stream management systems to always return perfect results when processing data streams originating from distributed sources. Data sources and intermediate processing nodes may fail during the lifetime of a stream query. In addition, individual nodes may become overloaded due to processing demands. In practice, users have to accept incomplete or inaccurate quer...
Comet: Batched Stream Processing in Data Intensive Distributed Computing
Performance and resource optimization is an important research problem in data intensive distributed computing. We present a new batched stream processing model that captures query correlations to expose I/O and computation redundancies for optimizations. The model is inspired by our empirical study on a trace from a production large-scale data processing cluster, which reveals significant redu...
Predictable Performance for Unpredictable Workloads
This paper introduces Crescando: a scalable, distributed relational table implementation designed to perform large numbers of queries and updates with guaranteed access latency and data freshness. To this end, Crescando leverages a number of modern query processing techniques and hardware trends. Specifically, Crescando is based on parallel, collaborative scans in main memory and so-called “que...
Journal title:
- PVLDB
Volume 3
Publication date: 2010